Head Movement Improves Auditory Speech Perception

Authors

  • K. G. Munhall
  • Jeffery A. Jones
  • Daniel E. Callan
  • Takaaki Kuratate
  • Eric Vatikiotis-Bateson
Abstract

People naturally move their heads when they speak, and our study shows that this rhythmic head motion conveys linguistic information. Three-dimensional head and face motion and the acoustics of a talker producing Japanese sentences were recorded and analyzed. The head movement correlated strongly with the pitch (fundamental frequency) and amplitude of the talker’s voice. In a perception study, Japanese subjects viewed realistic talking-head animations based on these movement recordings in a speech-in-noise task. The animations allowed the head motion to be manipulated without changing other characteristics of the visual or acoustic speech. Subjects correctly identified more syllables when natural head motion was present in the animation than when it was eliminated or distorted. These results suggest that nonverbal gestures such as head movements play a more direct role in the perception of speech than previously known.

During natural face-to-face conversations, a wide range of visual information from the movements of the face, head, and hands is available to conversational partners. In the work reported here, we studied the impact on speech perception of watching one component of this rich visual stimulus: a talker’s head movements. It is well known that the intelligibility of degraded auditory speech is enhanced when listeners view a talker’s lip movements (Sumby & Pollack, 1954). Watching these lip movements can also influence the perception of perfectly audible speech (McGurk & MacDonald, 1976) or be the sole basis of speech perception (Bernstein, Demorest, & Tucker, 2000). The contribution of nonverbal gestures such as head movements to the understanding of spoken language, however, is not well understood.

By nonverbal, we mean that the movements are not directly involved in the production of sound. We are not referring to symbolic gestures, such as head nodding to signify "yes" or head shaking to signify "no." The motions that we tested are the arbitrary, rhythmic movements of the head that always accompany speech.

Previous work on head gestures during speech focused on documenting their timing and motor organization (Hadar, Steiner, Grant, & Rose, 1983, 1984; Hadar, Steiner, & Rose, 1984). These studies suggested that head movements are linked to the production of suprasegmental features of speech such as stress, prominence, and other aspects of prosody. Consistent with the kinematic research of Hadar et al., several studies have shown that subjects can use head and eyebrow movements to determine which word in a sentence is receiving emphatic stress (e.g., Bernstein, Eberhardt, & Demorest, 1998; Risberg & Lubker, 1978; Thompson, 1934) and to discriminate statements from questions (Bernstein et al., 1998; Fisher, 1969; Nicholson, Baum, Cuddy, & Munhall, 2002; Srinivasan & Massaro, 2002).

In this study, we extended this research to test whether visual prosody, as embodied in head motion, plays a role in spoken word recognition. Auditory prosodic structure can aid the segmentation of words from the continuous stream of speech, as well as facilitate lexical access (see Cutler, Dahan, & van Donselaar, 1997, for a review). To examine whether visual prosody functions similarly, we tested whether the presence of visible head motion improved the intelligibility of Japanese sentences in a speech-in-noise task. In order to carry out this test, we created a stimulus set consisting of an animated talking face whose characteristics were initially derived directly from recordings of the face and head motion of a Japanese speaker (Kuratate, Yehia, & Vatikiotis-Bateson, 1998). The advantage of using animation is that head motion can be systematically varied independently of the acoustics and face motion in order to determine the influence of head motion on speech perception.

The existence of visual prosody effects would pose a number of challenges for current research on spoken word recognition. For example, the pool of potential cues for segmentation and lexical selection would have to be expanded to the visual modality; acoustic prosody is well documented, but its visual counterpart has been explored in only a preliminary manner. Study of the time course of word activation would have to reconcile differences in processing auditory and visual information at both neural and behavioral levels of observation. Finally, the neural substrates attributed to lexical processing would have to include the areas responsible for visual and audiovisual processing of prosody.

Address correspondence to K. G. Munhall, Department of Psychology, Queen’s University, Kingston, ON, Canada K7L 3N6; e-mail: [email protected].
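The strong correlation reported above between head motion and the voice’s fundamental frequency (F0) and amplitude can be pictured with a minimal sketch of that kind of kinematic-acoustic analysis. The sketch below is illustrative only and is not the authors’ actual pipeline; the synthetic signals, the common sampling rate, and the use of Pearson correlation are assumptions made here for clarity.

    # Hypothetical sketch: correlate a head-rotation trace with voice F0 and amplitude.
    # The signals are synthetic stand-ins already resampled to a common rate; they are
    # not the study's data, and the analysis choices are assumptions for illustration.
    import numpy as np
    from scipy.stats import pearsonr

    fs = 100                          # assumed common sampling rate (Hz)
    t = np.arange(0, 3, 1 / fs)       # three seconds of "speech"

    # A slow head rotation plus F0 and intensity contours that co-vary with it.
    head_rotation_deg = 2.0 * np.sin(2 * np.pi * 0.5 * t)
    f0_hz = 120 + 10 * np.sin(2 * np.pi * 0.5 * t) + np.random.randn(t.size)
    amplitude_db = 60 + 3 * np.sin(2 * np.pi * 0.5 * t) + np.random.randn(t.size)

    # Pearson correlation between the kinematic trace and each acoustic parameter.
    r_f0, _ = pearsonr(head_rotation_deg, f0_hz)
    r_amp, _ = pearsonr(head_rotation_deg, amplitude_db)
    print(f"head rotation vs. F0:        r = {r_f0:.2f}")
    print(f"head rotation vs. amplitude: r = {r_amp:.2f}")

In real recordings, F0 is defined only during voiced stretches of speech, so an actual analysis would restrict the comparison to those segments; the sketch glosses over that detail.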


Similar articles

Correlation between Auditory Spectral Resolution and Speech Perception in Children with Cochlear Implants

Background: Variability in speech performance is a major concern for children with cochlear implants (CIs). Spectral resolution is an important acoustic component in speech perception. Considerable variability and limitations of spectral resolution in children with CIs may lead to individual differences in speech performance. The aim of this study was to assess the correlation between auditory ...

Effect of Vowel Auditory Training on the Speech-In-Noise Perception among Older Adults with Normal Hearing

Introduction: Aging reduces the ability to understand speech in noise. Hearing rehabilitation is one of the ways to help older people communicate effectively. This study aimed to investigate the effect of vowel auditory training on the improvement of speech-in-noise (SIN) perception among elderly listeners.   Materials and Methods: This study was conducted on 36 elderly ...

Auditory neuropathy in two patients with generalized neuropathy: a case report

Background: Although it is not a new disorder, in recent times we have attained a greater understanding of auditory neuropathy (AN). In this type of hearing impairment, cochlear hair cells function but AN victims suffer from disordered neural transmission in the auditory pathway. The auditory neuropathy result profile often occurs as a part of that of the generalized neuropathic disorders, indi...

Lombard speech: Auditory (A), Visual (V) and AV effects

This study examined Auditory (A) and Visual (V) speech (speech-related head and face movement) as a function of noise environment. Measures of AV speech were recorded for 3 males and 1 female for 10 sentences spoken in quiet as well as four styles of background noise (Lombard speech). Auditory speech was analyzed in terms of overall intensity, duration, spectral tilt and prosodic parameters emp...

Speech development and auditory performance in children after cochlear implantation

Background: The aim of this study was to determine the auditory performance of congenitally deaf children and the effect of cochlear implantation (CI) on speech intelligibility. Methods: A prospective study was undertaken on 47 children in a pediatric tertiary referral center for CI. All children were prelingually deaf and were younger than 8 years of age. They were followed up until 5...


Journal title: Psychological Science
Volume 15, Number 2
Pages: 133-
Publication date: 2004